Can English perceivers match Cantonese auditory and visual prosody?
Abstract
The prosody of an utterance can be varied by changing F0, duration and amplitude. Such changes are typically accompanied by variation in the talker’s face/head motion (visual prosody). For native-language utterances, people can match auditory and visual prosody accurately. We tested whether English perceivers can do this with an unfamiliar language, Cantonese, which differs from English specifically with regard to suprasegmental properties (e.g., different rhythm type; use of lexical tone). These differences may make extraction of prosody difficult, because they distract English perceivers and/or because they affect the way prosody is realized. However, auditory-visual (AV) cues for prosody may be similar across languages and sufficiently salient to overcome the suprasegmental differences. We tested native Australian English participants (N=27) with 50 Cantonese sentences spoken as questions, narrow-focused utterances, or broad-focused utterances by two native Cantonese talkers. Participants completed a same-different matching task for auditory-auditory (AA), visual-visual (VV), and auditory-visual (AV) pairs. Each pair type consisted of the same sentence and talker, but different tokens. Matching performance was above chance for all conditions: AA > AV = VV. Results are discussed in terms of how auditory and visual prosody are conveyed and how this may be affected by language properties.
Related articles
Cross-modality matching of linguistic and emotional prosody
Talkers can express different meanings or emotions without changing what is said by changing how it is said (by using auditory and/or visual speech cues). Typically, cue strength differs between the auditory and visual channels: linguistic prosody (expression) is clearest in audition; emotional prosody is clearest visually. We investigated how well perceivers can match auditory and visual ...
The Effect of Tonal Information on Auditory Reliance in the McGurk Effect
The McGurk effect occurs when conflicting auditory and visual speech information results in an emergent percept. The incidence of the McGurk effect is greater for speakers of English than Japanese, and in turn for speakers of Japanese than Cantonese. Sekiyama postulates that this is because speakers of tonal languages rely more upon auditory than visual information in speech perception. Here thi...
Visual discrimination of Cantonese tone by tonal but non-Cantonese speakers, and by non-tonal language speakers
A previous study by the first two authors suggests there is visual information for tone perception: under certain conditions Cantonese speakers are able to identify spoken words as one of six Cantonese words differing only in tone on the basis of lip and face movements at a rate better than chance [1]. Here, non-native (tonal, Thai, and non-tonal, English) language speakers were tested on a dis...
Perceiving visual prosody from point-light displays
This study examined the perception of linguistic prosody from augmented point-light displays that were derived from motion tracking six talkers producing different prosodic contrasts. In Experiment 1, we determined perceivers’ ability to use these abstract visual displays to match prosody across modalities (audio to video), when the non-matching visual display was segmentally identical and diff...
Identifying visual prosody: where do people look?
Talkers produce different types of spoken prosody by varying acoustic cues (e.g., F0, duration, and amplitude), also making complementary head and face movements (visual prosody). Perceivers can categorise auditory and visual prosodic expressions at high levels of accuracy. Research using eyetracking trained participants to recognise the visual prosody of two-word sentences and found that the u...